Objects and saliency: reply to Borji et al.
Abstract
(a) The authors' reanalysis of our data with respect to the ITTI* model yields numerically very similar results, so there is no disagreement on the results per se. (b) We acknowledge that more recent "early saliency" models outperform our object-based approach. It is, however, no surprise that our naïve object model cannot live up to this comparison. For example, it is well established that fixations do not target objects uniformly (e.g., Nuthmann & Henderson, 2010) but tend towards the object's center. Taking this bias into account is therefore likely to improve object-based models. In our paper we deliberately chose the simplest possible object model (uniform distribution of fixations on the object) to show that even such a simple assumption outperformed the then widely used ITTI* model. How object-based models that use more realistic fixation distributions within an object fare against the best models used in Borji et al.'s (2013) comment is an interesting question for future research. (c) The improvement of the ITTI* model under the smoothing condition is consistent with the aforementioned distribution of fixations within objects, as smoothing tends to move high values of early saliency models from object edges towards their centers. (d) When comparing ITTI* to our object-based approach (e.g., with the "sAUC"), the Bonferroni correction used could in principle greatly increase the probability of a type II error (erroneously failing to reject the null hypothesis stated in the comment's title). Although this does not seem to be the case here (personal communication with the authors and table in their supplemental material), absence of significance does not, of course, imply absence of an effect, especially given the comparatively small number of participants in our original study.
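Point (b) contrasts a uniform object model with one that takes the bias towards the object's center into account. The following is a minimal sketch of that contrast, not the implementation used in either paper: the Gaussian weighting, the `sigma` value, the toy mask, the fixation coordinates, and all function names are assumptions made for illustration. It scores both maps with Normalized Scanpath Saliency (the mean z-scored map value at fixated pixels).

```python
import math

def uniform_object_map(mask):
    """Equal predicted fixation probability for every pixel inside the object."""
    n = sum(v for row in mask for v in row)
    return [[v / n for v in row] for row in mask]

def center_biased_object_map(mask, sigma):
    """Gaussian weighting around the object's centroid, restricted to the object.

    The Gaussian form and sigma are illustrative assumptions.
    """
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    cr = sum(r for r, _ in pts) / len(pts)
    cc = sum(c for _, c in pts) / len(pts)
    w = [[0.0] * len(row) for row in mask]
    for r, c in pts:
        d2 = (r - cr) ** 2 + (c - cc) ** 2
        w[r][c] = math.exp(-d2 / (2 * sigma ** 2))
    total = sum(v for row in w for v in row)
    return [[v / total for v in row] for row in w]

def nss(pred, fixations):
    """Normalized Scanpath Saliency: mean z-scored map value at fixated pixels."""
    vals = [v for row in pred for v in row]
    mean = sum(vals) / len(vals)
    std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return sum((pred[r][c] - mean) / std for r, c in fixations) / len(fixations)

# Toy 7x7 scene with a 5x5 object in the middle (hypothetical data).
mask = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)]
        for r in range(7)]
uniform = uniform_object_map(mask)
biased = center_biased_object_map(mask, sigma=1.5)

# Hypothetical fixations near the object's center and at one of its corners.
central_fixations = [(3, 3), (3, 2), (2, 3)]
corner_fixations = [(1, 1)]

print("central:", nss(uniform, central_fixations), nss(biased, central_fixations))
print("corner: ", nss(uniform, corner_fixations), nss(biased, corner_fixations))
```

In this toy setting the center-weighted map scores higher than the uniform map when fixations cluster near the object's center and lower when they fall on its corner, which is the direction of the effect the abstract anticipates. Whether a model of this kind closes the gap to recent saliency models on real data is left open by the text.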
Related articles
Supplement to: Objects do not predict fixations better than early saliency; A re-analysis of Einhäuser et al.’s data
We use three prevalent scoring methods to test the object-map hypothesis, since validity of this model boils down to fair model comparison with bottom-up saliency models. Please see Appendix B for explanation of state-of-the-art saliency models used here. We report results using the previously proposed Normalized Scanpath Saliency (NSS) (Peters et al., 2005; Parkhurst et al., 2002), Correlation...
Objects do not predict fixations better than early saliency: a re-analysis of Einhäuser et al.'s data.
Einhäuser, Spain, and Perona (2008) explored an alternative hypothesis to saliency maps (i.e., spatial image outliers) and claimed that "objects predict fixations better than early saliency." To test their hypothesis, they measured eye movements of human observers while they inspected 93 photographs of common natural scenes (Uncommon Places dataset by Shore, Tillman, & Schmidt-Wulffen 2004; Suppl...
Computational models of attention
A large body of psychophysical evidence on attention can be summarized by postulating two forms of visual attention (James, 1890/1981). The first is driven by the visual input; this so-called exogenous, bottom-up, stimulus-driven, or saliency-based form of attention is rapid, operates in parallel throughout the entire visual field, and helps mediate pop-out, the phenomenon by which some visual ...
Where Should Saliency Models Look Next?
Recently, large breakthroughs have been observed in saliency modeling. The top scores on saliency benchmarks have become dominated by neural network models of saliency, and some evaluation scores have begun to saturate. Large jumps in performance relative to previous models can be found across datasets, image types, and evaluation metrics. Have saliency models begun to converge on human perform...
Distinct Class Saliency Maps for Multiple Object Images
This paper proposes a method to obtain more distinct class saliency maps than Simonyan et al. (2014). We made three improvements over their method: (1) using CNN derivatives with respect to feature maps of the intermediate convolutional layers with up-sampling instead of an input image; (2) subtracting saliency maps of the other classes from saliency maps of the target class to differentiate ta...
Journal title:
- Journal of Vision
Volume 13, Issue 10
Publication date: 2013